Media content items sequencing

ABSTRACT

A media content item sequencing system determines a sequence for playback of selected media content items, such as media content items in a playlist. The system calculates similarities between all possible pairs of the media content items and determines a sequence of the media content items using the similarities. The sequence of media content items can be determined by modeling the track features of the media content items with a graph traversal problem and calculating a solution to the problem with various methods.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims priority to U.S. Application Ser. No. 62/313,636 filed on Mar. 25, 2016 and entitled SYSTEM AND METHOD FOR AUTOMATIC AND SCALABLE PLAYLIST SEQUENCING AND TRANSITIONS, the disclosure of which is hereby incorporated by reference in its entirety.

BACKGROUND

Media content, such as audio content or video content, is widely consumed in various environments, such as daily, recreation, or fitness activities. Examples of audio content include songs, albums, podcasts, audiobooks, etc. Examples of video content include movies, music videos, television episodes, etc. Using a mobile phone or other media playback device a person can access large catalogs of media content. For example, a user can access an almost limitless catalog of media content through various free and subscription-based streaming services. Additionally, a user can store a large catalog of media content on his or her mobile device.

This nearly limitless access to media content introduces new challenges for users. For example, it may be difficult to find or select the right media content that complements a particular moment such as running or other repetitive-motion activity. Further, it is desirable to play a series of media content items to create engaging, seamless, and cohesive listening experiences, which can be provided by professional music curators and DJs who carefully sort and mix tracks together. Average listeners typically lack the time and skill required to craft such an experience for their own personal enjoyment.

SUMMARY

In general terms, this disclosure is directed to systems and methods for managing a sequence between media content items. In one possible configuration and by non-limiting example, the systems and methods use a plurality of track features of media content items and determine a sequence of media content items based on similarities of the track features thereof. Various aspects are described in this disclosure, which include, but are not limited to, the following aspects.

One aspect is a method for playing media content items. The method includes determining a plurality of track features of each of the media content items; obtaining weighting data for the plurality of track features; generating a plurality of weighted track features for each of the media content items by applying the weighting data to the plurality of track features of each of the media content items; calculating aggregated track features for the media content items, respectively, based on the plurality of weighted track features; comparing the aggregated track features to determine similarities between the aggregated track features; and determining a sequence of the media content items based on the similarities.

Another aspect is a method for sequencing media content items. The method comprises determining a plurality of track features of each of the media content items; weighting the plurality of track features; mapping the plurality of weighted track features of each of the media content items to an aggregated feature vector; determining similarities among the aggregated feature vectors; and determining a sequence of the media content items based on the similarities.

Yet another aspect is a computer readable storage device storing data instructions that, when executed by a processing device, cause the processing device to: determine a plurality of track features of each of the media content items; weight the plurality of track features; map the plurality of weighted track features of each of the media content items to an aggregated feature vector; determine similarities among the aggregated feature vectors; and determine a sequence of the media content items based on the similarities.

Another aspect is a system comprising: at least one processing device; and at least one computer readable storage device storing data instructions, which, when executed by the at least one processing device, cause the at least one processing device to: determine a plurality of track features of each of the media content items; weight the plurality of track features; map the plurality of weighted track features of each of the media content items to an aggregated feature vector; determine similarities among the aggregated feature vectors; and determine a sequence of the media content items based on the similarities.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an example system for automatically sequencing and playing media content items.

FIG. 2 is a schematic illustration of an example system for automatically sequencing and playing media content items.

FIG. 3 illustrates an example method for automatically sequencing media content items.

FIG. 4 illustrates example track features.

FIG. 5 illustrates an example method for obtaining weighting data.

FIG. 6 illustrates an example user interface for receiving a user input of weighting.

FIG. 7 illustrates another method for obtaining weighting data.

FIG. 8 illustrates an example table for showing track features and aggregated track feature for each media content item.

FIG. 9 is an example table showing sequencing of the media content items 116 based on aggregated track features.

FIG. 10 illustrates another example method for automatically sequencing media content items.

FIG. 11 is a diagram illustrating operations in the method of FIG. 10.

FIG. 12 illustrates example mapping of key and mode information in a three dimensional space.

FIG. 13 illustrates example mapping of tempo in a binary logarithmic scale.

FIG. 14 illustrates an example method for determining similarities between media content items.

FIG. 15 illustrates an example method for determining a sequence of media content items.

FIG. 16 is an example graph for determining a sequence of media content items.

FIG. 17 illustrates an example system for managing a sequence between media content items to continuously support a repetitive motion activity.

FIG. 18 illustrates an example of the media delivery system of FIG. 17 for managing a sequence between media content items to continuously support a repetitive motion activity.

DETAILED DESCRIPTION

Various embodiments will be described in detail with reference to the drawings, wherein like reference numerals represent like parts and assemblies throughout the several views. Reference to various embodiments does not limit the scope of the claims attached hereto. Additionally, any examples set forth in this specification are not intended to be limiting and merely set forth some of the many possible embodiments for the appended claims.

In general, the system of the present disclosure determines a sequence for playback of selected media content items, such as media content items in a playlist. For a given set of media content items (for example, in the form of a playlist), the system calculates similarities between all possible pairs of the media content items, and determines a sequence of the media content items using the similarities. Each of the similarities can be calculated by comparing track features of two media content items. In some embodiments, such track features can be represented as numerical values. In other embodiments, the track features can be represented as a vector. The sequence of media content items can be determined by modeling the track features of the media content items with a graph traversal problem and calculating a solution to the problem with various methods.

In certain examples, the system of the present disclosure is used to play back a plurality of media content items to continuously support a user's repetitive motion activity without disrupting the user's cadence.

As such, the system provides a simple, efficient solution to sequencing of selected media content items with professional-level quality. In certain examples, the management process for sequencing between media content items is executed in a server computing device, rather than a user's media playback device. Accordingly, the media playback device can save its resources for playing back media content items in a desirable sequence, and the management process can be efficiently maintained and conveniently modified as appropriate without interacting with the media playback device.

FIG. 1 illustrates an example system 100 for automatically sequencing and playing media content items. In this example, the system 100 includes a media playback device 102 and a media delivery system 104. The system 100 communicates across a network 106. In some embodiments, a media content sequencing engine 110 runs on the media playback device 102, and a media content sequence determination engine 112 runs on the media delivery system 104. Also shown is a user U who uses the media playback device 102 to play back a set of media content items in a playlist 114.

The media playback device 102 operates to play media content items to produce media output 108. In some embodiments, the media content items are provided by the media delivery system 104 and transmitted to the media playback device 102 using the network 106. A media content item is an item of media content, including audio, video, or other types of media content, which may be stored in any format suitable for storing media content. Non-limiting examples of media content items include songs, albums, music videos, movies, television episodes, podcasts, other types of audio or video content, and portions or combinations thereof. In this document, the media content items can also be referred to as tracks.

The media delivery system 104 operates to provide media content items to the media playback device 102. In some embodiments, the media delivery system 104 is connectable to a plurality of media playback devices 102 and provides media content items to the media playback devices 102 independently or simultaneously.

The media content sequencing engine 110 operates to play media content items in a desirable sequence. In some embodiments, a sequence of the media content items is determined by the media delivery system 104, and the media playback device 102 merely operates to play back the media content items according to the sequence. In other embodiments, the media content sequencing engine 110 operates to determine such a sequence of the media content items, either independently or in cooperation with the media delivery system 104 including the media content sequence determination engine 112.

In some embodiments, as illustrated in FIGS. 17 and 18, the system 100 operates to play media content items in such a sequence as to continuously support the user's repetitive motion activity without interruption.

The media content sequence determination engine 112 operates to determine a sequence of media content items which are played. In some embodiments, a sequence of the media content items is determined by the media delivery system 104, either independently or in cooperation with the media playback device 102 including the media content sequencing engine 110. As described herein, in some embodiments, the media content sequence determination engine 112 operates to determine a sequence of media content items where a group of the media content items is given to be played on the media playback device 102. Such a group of media content items can be provided in the form of a playlist 114, which can be manually selected by the user and/or automatically populated for the user. In other embodiments, the sequencing can be determined for other media content items stored in either or both of the media playback device 102 and the media delivery system 104.

FIG. 2 is a schematic illustration of an example system 100 for automatically sequencing and playing media content items. As also illustrated in FIG. 1, the system 100 can include the media playback device 102, the media delivery system 104, and the network 106.

As described herein, the media playback device 102 operates to play media content items. In some embodiments, the media playback device 102 operates to play media content items that are provided (e.g., streamed, transmitted, etc.) by a system external to the media playback device such as the media delivery system 104, another system, or a peer device. Alternatively, in some embodiments, the media playback device 102 operates to play media content items stored locally on the media playback device 102. Further, in at least some embodiments, the media playback device 102 operates to play media content items that are stored locally as well as media content items provided by other systems.

In some embodiments, the media playback device 102 is a computing device, handheld entertainment device, smartphone, tablet, watch, wearable device, or any other type of device capable of playing media content. In yet other embodiments, the media playback device 102 is a laptop computer, desktop computer, television, gaming console, set-top box, network appliance, Blu-ray or DVD player, media player, stereo, or radio.

In at least some embodiments, the media playback device 102 includes a location-determining device 130, a touch screen 132, a processing device 134, a memory device 136, a content output device 138, and a network access device 140. Other embodiments may include additional, different, or fewer components. For example, some embodiments may include a recording device such as a microphone or camera that operates to record audio or video content. As another example, some embodiments do not include one or more of the location-determining device 130 and the touch screen 132.

The location-determining device 130 is a device that determines the location of the media playback device 102. In some embodiments, the location-determining device 130 uses one or more of the following technologies: Global Positioning System (GPS) technology which may receive GPS signals from satellites S, cellular triangulation technology, network-based location identification technology, Wi-Fi positioning systems technology, and combinations thereof.

The touch screen 132 operates to receive an input from a selector (e.g., a finger, stylus, etc.) controlled by the user U. In some embodiments, the touch screen 132 operates as both a display device and a user input device. In some embodiments, the touch screen 132 detects inputs based on one or both of touches and near-touches. In some embodiments, the touch screen 132 displays a user interface 144 for interacting with the media playback device 102. As noted above, some embodiments do not include a touch screen 132. Some embodiments include a display device and one or more separate user interface devices. Further, some embodiments do not include a display device.

In some embodiments, the processing device 134 comprises one or more central processing units (CPU). In other embodiments, the processing device 134 additionally or alternatively includes one or more digital signal processors, field-programmable gate arrays, or other electronic circuits.

The memory device 136 operates to store data and instructions. In some embodiments, the memory device 136 stores instructions for a media playback engine 146 that includes a media content selection engine 148 and the media content sequencing engine 110.

The memory device 136 typically includes at least some form of computer-readable media. Computer readable media include any available media that can be accessed by the media playback device 102. By way of example, computer-readable media include computer readable storage media and computer readable communication media.

Computer readable storage media includes volatile and nonvolatile, removable and non-removable media implemented in any device configured to store information such as computer readable instructions, data structures, program modules, or other data. Computer readable storage media includes, but is not limited to, random access memory, read only memory, electrically erasable programmable read only memory, flash memory and other memory technology, compact disc read only memory, Blu-ray discs, digital versatile discs or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by the media playback device 102. In some embodiments, computer readable storage media is non-transitory computer readable storage media.

Computer readable communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” refers to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, computer readable communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency, infrared, and other wireless media. Combinations of any of the above are also included within the scope of computer readable media.

The content output device 138 operates to output media content. In some embodiments, the content output device 138 generates media output 108 (FIG. 1) for the user U. Examples of the content output device 138 include a speaker, an audio output jack, a Bluetooth transmitter, a display panel, and a video output jack. Other embodiments are possible as well. For example, the content output device 138 may transmit a signal through the audio output jack or Bluetooth transmitter that can be used to reproduce an audio signal by a connected or paired device such as headphones or a speaker.

The network access device 140 operates to communicate with other computing devices over one or more networks, such as the network 106. Examples of the network access device include wired network interfaces and wireless network interfaces. Wireless network interfaces include infrared, BLUETOOTH® wireless technology, 802.11a/b/g/n/ac, and cellular or other radio frequency interfaces in at least some possible embodiments.

The media playback engine 146 operates to play back one or more of the media content items (e.g., music) to the user U. When the user U is running while using the media playback device 102, the media playback engine 146 can operate to play media content items to encourage the running of the user U, as illustrated with respect to FIGS. 17 and 18. As described herein, the media playback engine 146 is configured to communicate with the media delivery system 104 to receive one or more media content items (e.g., through the stream media 180), as well as sequencing data generated by the media delivery system 104 for sequencing media content items. Alternatively, such sequencing data can be locally generated by, for example, the media playback device 102.

The media content selection engine 148 operates to retrieve one or more media content items. In some embodiments, the media content selection engine 148 is configured to send a request to the media delivery system 104 for media content items and receive information about such media content items for playback. In some embodiments, media content items can be stored in the media delivery system 104. In other embodiments, media content items can be stored locally in the media playback device 102. In yet other embodiments, some media content items can be stored locally in the media playback device 102 and other media content items can be stored in the media delivery system 104.

The media content sequencing engine 110 is included in the media playback engine 146 in some embodiments. The media content sequencing engine 110, either independently or in cooperation with the media content sequence determination engine 112, can operate to arrange similar media content items closely so as to provide engaging, seamless and cohesive listening experiences which would otherwise be manually performed by music professionals, such as disc jockeys. Such sequencing can be performed by the media content sequence determination engine 112 of the media delivery system 104 alone. As described herein, such a sequence of media content items can also support a user's repetitive motion activity.

With still reference to FIG. 2, the media delivery system 104 includes one or more computing devices and operates to provide media content items to the media playback devices 102 and, in some embodiments, other media playback devices as well. In some embodiments, the media delivery system 104 operates to transmit stream media 180 to media playback devices such as the media playback device 102.

In some embodiments, the media delivery system 104 includes a media server application 150, a processing device 152, a memory device 154, and a network access device 156. The processing device 152, memory device 154, and network access device 156 may be similar to the processing device 134, memory device 136, and network access device 140 respectively, which have each been previously described.

In some embodiments, the media server application 150 operates to stream music or other audio, video, or other forms of media content. The media server application 150 includes a media stream service 160, a media data store 162, and a media application interface 164.

The media stream service 160 operates to buffer media content such as media content items 170 (including 170A, 170B, and 170Z) for streaming to one or more streams 172A, 172B, and 172Z.

The media application interface 164 can receive requests or other communication from media playback devices or other systems, to retrieve media content items from the media delivery system 104. For example, in FIG. 2, the media application interface 164 receives communication 182 from the media playback engine 146.

In some embodiments, the media data store 162 stores media content items 170, media content metadata 174, and playlists 176. The media data store 162 may comprise one or more databases and file systems. Other embodiments are possible as well. As noted above, the media content items 170 may be audio, video, or any other type of media content, which may be stored in any format for storing media content.

The media content metadata 174 operates to provide various pieces of information associated with the media content items 170. In some embodiments, the media content metadata 174 includes one or more of title, artist name, album name, length, genre, mood, era, etc.

In some embodiments, the media content metadata 174 includes acoustic metadata, cultural metadata, and explicit metadata. The acoustic metadata, which may be derived from analysis of the track, refers to a numerical or mathematical representation of the sound of a track. Acoustic metadata may include temporal information such as tempo, rhythm, beats, downbeats, tatums, patterns, sections, or other structures. Acoustic metadata may also include spectral information such as melody, pitch, harmony, timbre, chroma, loudness, vocalness, or other possible features. Acoustic metadata may take the form of one or more vectors, matrices, lists, tables, and other data structures. Acoustic metadata may be derived from analysis of the music signal. One form of acoustic metadata, commonly termed an acoustic fingerprint, may uniquely identify a specific track. Other forms of acoustic metadata may be formed by compressing the content of a track while retaining some or all of its musical characteristics.

The cultural metadata refers to text-based information describing listeners' reactions to a track or song, such as styles, genres, moods, themes, similar artists and/or songs, rankings, etc. Cultural metadata may be derived from expert opinion such as music reviews or classification of music into genres. Cultural metadata may be derived from listeners through websites, chatrooms, blogs, surveys, and the like. Cultural metadata may include sales data, shared collections, lists of favorite songs, and any text information that may be used to describe, rank, or interpret music. Cultural metadata may also be generated by a community of listeners and automatically retrieved from Internet sites, chat rooms, blogs, and the like. Cultural metadata may take the form of one or more vectors, matrices, lists, tables, and other data structures. A form of cultural metadata particularly useful for comparing music is a description vector. A description vector is a multi-dimensional vector associated with a track, album, or artist. Each term of the description vector indicates the probability that a corresponding word or phrase would be used to describe the associated track, album or artist.

The explicit metadata refers to factual or explicit information relating to music. Explicit metadata may include album and song titles, artist and composer names, other credits, album cover art, publisher name and product number, and other information. Explicit metadata is generally not derived from the music itself or from the reactions or opinions of listeners.

At least some of the metadata 174, such as explicit metadata (names, credits, product numbers, etc.) and cultural metadata (styles, genres, moods, themes, similar artists and/or songs, rankings, etc.), for a large library of songs or tracks can be evaluated and provided by one or more third party service providers. Acoustic and cultural metadata may take the form of parameters, lists, matrices, vectors, and other data structures. Acoustic and cultural metadata may be stored as XML files, for example, or any other appropriate file type. Explicit metadata may include numerical, text, pictorial, and other information. Explicit metadata may also be stored in an XML or other file. All or portions of the metadata may be stored in separate files associated with specific tracks. All or portions of the metadata, such as acoustic fingerprints and/or description vectors, may be stored in a searchable data structure, such as a k-D tree or other database format.

The playlists 176, which include the playlist 114 (FIG. 1), operate to identify one or more of the media content items 170. In some embodiments, the playlists 176 identify a group of the media content items 170 in a particular order. In other embodiments, the playlists 176 merely identify a group of the media content items 170 without specifying a particular order. Some, but not necessarily all, of the media content items 170 included in a particular one of the playlists 176 are associated with a common characteristic such as a common genre, mood, or era.

In some embodiments, playlists can be manually created, modified, and managed by users. In other embodiments, playlists can be automatically created by the media delivery system 104, the media playback device 102, and any other computing devices and presented or recommended to the users.

Referring still to FIG. 2, the network 106 is an electronic communication network that facilitates communication between the media playback device 102 and the media delivery system 104. An electronic communication network includes a set of computing devices and links between the computing devices. The computing devices in the network use the links to enable communication among the computing devices in the network. The network 106 can include routers, switches, mobile access points, bridges, hubs, intrusion detection devices, storage devices, standalone server devices, blade server devices, sensors, desktop computers, firewall devices, laptop computers, handheld computers, mobile telephones, and other types of computing devices.

In various embodiments, the network 106 includes various types of links. For example, the network 106 can include wired and/or wireless links, including Bluetooth, ultra-wideband (UWB), 802.11, ZigBee, cellular, and other types of wireless links. Furthermore, in various embodiments, the network 106 is implemented at various scales. For example, the network 106 can be implemented as one or more local area networks (LANs), metropolitan area networks, subnets, wide area networks (such as the Internet), or can be implemented at another scale. Further, in some embodiments, the network 106 includes multiple networks, which may be of the same type or of multiple different types.

Although FIG. 2 illustrates only a single media playback device 102 communicable with a single media delivery system 104, in accordance with some embodiments, the media delivery system 104 can support the simultaneous use of multiple media playback devices, and the media playback device can simultaneously access media content from multiple media delivery systems. Additionally, although FIG. 2 illustrates a streaming media based system for managing sequencing of media content items, other embodiments are possible as well. For example, in some embodiments, the media playback device 102 includes a media data store 162 and the media playback device 102 is configured to perform management of sequencing between media content items without accessing the media delivery system 104. Further, in some embodiments, the media playback device 102 operates to store previously streamed media content items in a local media data store.

FIG. 3 illustrates an example method 200 for automatically sequencing media content items. In this example, the method 200 is described as being performed in the media delivery system 104 including the media content sequence determination engine 112. However, in other embodiments, only some of the processes in the method 200 can be performed by the media delivery system 104. In other embodiments, all or some of the processes in the method 200 are performed by the media playback device 102. In yet other embodiments, all or some of the processes in the method 200 are performed by both of the media delivery system 104 and the media playback device 102 in cooperation.

Within this description, the terms “automatically” and “automated” mean “without user intervention”. An automated task may be initiated by a user but an automated task, once initiated, proceeds to a conclusion without further user action.

Within this description, a “track” is a digital data file containing audio information. A track may be stored on a storage device such as a hard disc drive, and may be a component of a library of audio tracks. A track may be a recording of a song or a section, such as a movement, of a longer musical composition. A track may be stored in any known or future audio file format. A track may be stored in an uncompressed format, such as a WAV file, or a compressed format such as an MP3 file. In this document, however, a track is not limited to audio, and it is understood that a track can be a media content item of any suitable type.

The method 200 can begin at operation 202, in which the media delivery system 104 receives a selection of media content items. In some embodiments, the media content items to be sequenced are identified in a playlist 114 (FIG. 1). The media content items to be sequenced can be manually selected by the user or automatically provided to the user.

At operation 204, the media delivery system 104 determines one or more track features of each media content item. Track features represent various characteristics of a media content item in various forms. In some embodiments, track features can be obtained from various sources, such as the media content metadata 174 including acoustic metadata, cultural metadata, and explicit metadata. In other embodiments, track features can be obtained by retrieving the media content metadata 174 and processing it into different formats. Example track features which can be used for sequencing are further described with reference to FIG. 4.

At operation 206, the media delivery system 104 obtains weighting data. At operation 208, the media delivery system 104 then weights the track features based on the weighting data, thereby generating weighted track features for each media content item.

Track features and weighted track features can be represented in various formats. In some embodiments, track features and weighted track features can be represented by a numerical value or score, as illustrated in FIG. 8. In other embodiments, track features and weighted track features can be represented as vectors, such as feature vectors 376, as illustrated in FIG. 11. Other forms are also possible to represent track features and weighted track features in yet other embodiments.

The weighting data, such as weighting data 380 (FIG. 11), includes information usable to weight (also referred to herein as scale) different track features. As described herein, the track features used for sequencing can be scaled such that the selected media content items are ordered to flow smoothly from one item to the next. The notion of “flowing smoothly” can be content dependent. For example, some situations require that the tempo doesn't change abruptly while other situations require that neighboring tracks are acoustically similar. By way of example, desirable sequencing can ensure that consecutive pairs of media content items have similar keys and tempos, allowing for less jarring transitions.

In some embodiments, the track features used for sequencing can be weighted in a way that is consistent with intended applications. By way of example, a generic playlist of media content items can be sequenced using only timbral descriptors, while tempo and key consistency may be the most important aspects in the case of a dance party playlist where the crossfade between media content items should preserve the rhythmic regularity and harmonic flow. As such, the track features can be weighted differently according to various factors which may determine the characteristics of the set (e.g., playlist) of media content items to be sequenced.

In some embodiments, weighting information included in the weighting data can be selected or adjusted manually by a user, as further illustrated in FIG. 5. Alternatively or in addition, such weights can be automatically determined as further illustrated in FIG. 7.

At operation 210, the media delivery system 104 calculates an aggregated track feature for each media content item based on the weighted track features for that media content item. In some embodiments, the aggregated track feature for each media content item, such as an aggregated track feature 302 (FIG. 8), can be determined as a sum of the weighted track features that are obtained at the operation 208. In other embodiments, the aggregated track feature can be obtained by using the weighted track features differently.
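
By way of illustration only, the following Python sketch shows one possible weighted-sum aggregation consistent with the description above; the feature names and numerical values are assumptions for illustration and are not prescribed by this disclosure.

```python
def aggregate_track_feature(track_features, weights):
    """Compute an aggregated track feature as a weighted sum of track features.

    track_features and weights are dicts keyed by feature name,
    e.g. {"timbre": 0.40, "key_mode": 0.90, "tempo": 0.62}.
    """
    return sum(weights[name] * value for name, value in track_features.items())

# Illustrative values only; real features would come from the media content metadata.
features = {"timbre": 0.40, "key_mode": 0.90, "tempo": 0.62}
weights = {"timbre": 0.0, "key_mode": 0.0, "tempo": 1.0}  # tempo-only weighting
print(aggregate_track_feature(features, weights))  # 0.62
```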

The aggregated track feature can be represented in various formats. In some embodiments, the aggregated track feature can be represented by a numerical value or score, as illustrated in FIG. 8. In other embodiments, the aggregated track feature can be represented as a vector, such as an aggregated feature vector 378, as illustrated in FIG. 11. Other forms are also possible to represent the aggregated track feature in yet other embodiments.

In some embodiments, the operation 210 can be repeated until the aggregated track features are obtained for all of the media content items to be sequenced.

At operation 212, the media delivery system 104 compares the aggregated track features. At operation 214, the media delivery system 104 determines similarities between the media content items based on the comparison between the media content items' aggregated track features.

A similarity between media content items can be calculated in various ways. In some embodiments, where aggregated track features are represented as numerical values, a similarity between two media content items can be determined based on a difference between the aggregated track feature values of the two media content items. In other embodiments, where aggregated track features are represented as vectors, a similarity between two media content items can be determined by calculating the Euclidean distance between the vectors representative of the aggregated track features of the two media content items. In yet other embodiments, any other similarity or comparison measurement can be used to compare two media content items.
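
As a rough sketch of the two comparisons mentioned above (absolute difference of scalar aggregated features, and Euclidean distance between aggregated feature vectors), the helper functions below are illustrative only; in both cases a smaller result implies greater similarity.

```python
import math

def difference_of_values(a, b):
    """Absolute difference between two scalar aggregated track feature values."""
    return abs(a - b)

def distance_of_vectors(u, v):
    """Euclidean distance between two aggregated feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```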

A similarity can be represented in various formats. In some embodiments, a similarity result can be a value indicating the similarity between two media content items on a predetermined scale. For example, a similarity can be a score having a value between 0 and 1, 0 and 100, etc., with 0 indicating no similarity between two media content items and the maximum value indicating that two media content items are highly similar or identical. The similarity result may be expressed as a difference score, where zero may indicate no difference between two media content items and a higher value may indicate an increasing degree of difference. The similarity score may be quantized into levels, for example A/B/C/D/E, for reporting to the requester. The similarity score may be compared to a predetermined threshold and converted into a binary value, for example Yes/No, for reporting to the requester.
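
A minimal sketch of such reporting, assuming a similarity score normalized to the range 0 to 1; the level boundaries and the 0.8 threshold are assumed values, not values from this disclosure.

```python
def quantize_similarity(score, levels="ABCDE"):
    """Map a similarity score in [0, 1] to discrete levels, 'A' being most similar."""
    index = min(int((1.0 - score) * len(levels)), len(levels) - 1)
    return levels[index]

def is_similar(score, threshold=0.8):
    """Convert a similarity score into a Yes/No answer using an assumed threshold."""
    return "Yes" if score >= threshold else "No"
```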

At operation 216, the media delivery system 104 operates to sequence the media content items based on the similarities. In some embodiments, where the aggregated track features are represented by numerical values, a difference between any two of the aggregated track features can determine an order of the media content items. Such an order of the media content items can begin with a seed media content item, which is selected from the media content items and is to be played first among the media content items. The seed media content item can be manually selected by the user, or automatically selected by the media delivery system 104 or the media playback device 102.

By way of example, when the seed media content item is given, the next media content item can be selected to be a media content item having an aggregated track feature value that is more similar to the aggregated track feature value of the seed media content item than to the aggregated track feature values of the other media content items. As a simple example of sequencing three media content items, a first media content item is arranged prior to a second media content item, and the second media content item is arranged prior to a third media content item, when a difference between an aggregated track feature value of the first media content item and an aggregated track feature value of the second media content item is smaller than a difference between the aggregated track feature value of the first media content item and an aggregated track feature value of the third media content item.
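
One possible greedy ordering consistent with this example, shown only as a sketch: starting from the seed, each step picks the remaining item whose aggregated track feature value is closest to the current item's value. The track IDs and values are illustrative assumptions.

```python
def sequence_tracks(aggregated, seed_id):
    """Order tracks starting at the seed, repeatedly choosing the remaining track
    whose aggregated track feature value is closest to the current track's value."""
    remaining = dict(aggregated)
    order = [seed_id]
    current = remaining.pop(seed_id)
    while remaining:
        next_id = min(remaining, key=lambda tid: abs(remaining[tid] - current))
        current = remaining.pop(next_id)
        order.append(next_id)
    return order

# Illustrative values only.
print(sequence_tracks({"1117": 0.62, "2034": 0.70, "3310": 0.55}, "1117"))
```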

FIG. 4 illustrates example track features 230, which can be used for sequencing media content items in a playlist. Several acoustic aspects of a media content item can be exposed and combined differently for different applications. In some embodiments, the track features 230 include acoustic features 240, key and mode information 242, and tempo 244.

In some embodiments, the track features 230 are computed for each track in the media delivery system 104. In other embodiments, the track features 230 can be calculated using one or more software programs running on the media delivery system or one or more other computing devices.

The acoustic features 240 represent the sound of a media content item, such as timbre, melody, pitch, harmony, and other possible features. In some embodiments, the acoustic features 240 can be obtained from the acoustic metadata of the media content item.

In some embodiments, a timbre feature 250 is used as an example of the acoustic features. The timbre feature 250 is the character or quality of a sound or voice as distinct from its pitch and intensity. A timbre feature is a perceived sound quality of a musical note, sound, or tone that distinguishes different types of sound production, such as choir voices, and musical instruments, such as string instruments, wind instruments, and percussion instruments.

The key and mode information 242 describes the key and mode of a media content item. The mode generally refers to a type of scale, coupled with a set of characteristic melodic behaviors. The key of a piece is a group of pitches, or scale, upon which a music composition is created. The group features a tonic note and its corresponding chords, also called a tonic or tonic chord, which provides a subjective sense of arrival and rest and also has a unique relationship to the other pitches of the same group, their corresponding chords, and pitches and chords outside the group. Notes and chords other than the tonic in a piece create varying degrees of tension, resolved when the tonic note or chord returns. The key may be in the major or minor mode.

The tempo 244 indicates the speed or pace of a given piece or subsection thereof (how fast or slow it is). Tempo is related to meter and is usually measured in beats per minute, with the beats being a division of the measures, though tempo is often indicated by terms which have acquired standard ranges of beats per minute or is assumed by convention without indication.

In other embodiments, any other features or aspects of a media content item can be additionally or alternatively used as the track features 230. The methods of using the track features 230 for sequencing described herein are also applicable to such other features and aspects used as the track features 230.

FIG. 5 illustrates an example method 270 for obtaining weighting data, which can be used at the operation 206 in the method 200 as described in FIG. 3. The method 270 is described herein with further reference to FIG. 6, which illustrates an example user interface for receiving a user input of weighting.

In this example, the method 270 is described as being performed in the media delivery system 104. However, in other embodiments, only some of the processes in the method 270 can be performed by the media delivery system 104. In other embodiments, all or some of the processes in the method 270 are performed by the media playback device 102. In yet other embodiments, all or some of the processes in the method 270 are performed by both of the media delivery system 104 and the media playback device 102 in cooperation. In yet other embodiments, the method 270 can be performed by other computing devices and provided to the media delivery system 104.

The method 270 is used to receive a manual selection of weights from a user. The weights can be determined based on the overall characteristics (such as styles, genres, moods, themes, similar artists and/or songs, and rankings) of the media content items in the playlist. The weights are used to scale a plurality of track features such that the media content items are sequenced and played to provide a smooth, continuous playback. The given media content items may have similar values in one or more particular track features, and thus the weights can be adjusted to emphasize such particular track features more than other track features which are not shared by all or a majority of the media content items. By way of example, where a playlist includes media content items generally suitable for a dance party, consistency in tempo and key may be important aspects to preserve rhythmic regularity and harmonic flow between the media content items. In this case, the weights can be adjusted or set to give more weight to the tempo and key features.
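
A hypothetical weight selection of this kind might look like the sketch below; the preset names and numerical weights are assumptions chosen only to illustrate emphasizing tempo and key for a dance-party playlist, and are not values prescribed by this disclosure.

```python
# Hypothetical weighting presets keyed by playlist characteristic (illustrative only).
WEIGHT_PRESETS = {
    "dance_party": {"timbre": 0.2, "key_mode": 0.9, "tempo": 1.0},
    "generic":     {"timbre": 1.0, "key_mode": 0.3, "tempo": 0.3},
}

weights = WEIGHT_PRESETS["dance_party"]
```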

The method 270 can begin at operation 272, in which the media delivery system 104 operates to provide a user interface for receiving a user input of weights on track features. The user interface enables a user to input or adjust weights for one or more track features 230. An example of the user interface is illustrated in FIG. 6. As shown in FIG. 6, the user interface 280 provides one or more control elements, such as sliders 282, allowing a user to make adjustments to values of track features 230. In some embodiments, the user interface can be presented on a computing device which is connected to the media delivery system 104 and operated by the user, and the computing device can transmit the input to the media delivery system 104 upon receiving the input from the user through the user interface.

At operation 274, the media delivery system 104 operates to receive a user input of weights on one or more track features 230. In some embodiments, where the user inputs the weighting values through a user computing device, the media delivery system 104 receives such inputs from the user computing device. In other embodiments, the media delivery system 104 can directly receive the user input of weights from the user.

FIG. 7 illustrates another method 290 for obtaining weighting data, which can be used at the operation 206 in the method 200 as described in FIG. 3. In this example, the method 290 is described as being performed in the media delivery system 104. However, in other embodiments, only some of the processes in the method 290 can be performed by the media delivery system 104. In other embodiments, all or some of the processes in the method 290 are performed by the media playback device 102. In yet other embodiments, all or some of the processes in the method 290 are performed by both of the media delivery system 104 and the media playback device 102 in cooperation. In yet other embodiments, the method 290 can be performed by other computing devices and provided to the media delivery system 104.

The method 290 is used to automatically determine the weights for scaling the track features 230. The method 290 can begin at operation 292, in which the media delivery system 104 obtains sequencing history data. The sequencing history data include information about a history of sequencing media content items in general. In some embodiments, the sequencing history data include a large volume of past sequencing events that have been performed by music professionals, such as professional music curators and disc jockeys. In other embodiments, the sequencing history data include a large volume of past sequencing events that have been performed by at least some users or listeners of the media content items provided by the media delivery system 104.

At operation 294, the media delivery system 104 operates to determine the sequencing history of the given media content items based on the sequencing history data. In some embodiments, the media delivery system 104 can identify a particular characteristic of the selected media content items to be sequenced. Given a set of media content items, the set of media content items (for example, in the form of a playlist) can be characterized as having a particular attribute in common, such as styles, genres, moods, themes, similar artists and/or songs, rankings, etc. The media delivery system 104 can then determine, from the sequencing history data, how the same media content items, or media content items having a characteristic similar to that of the selected media content items, have been sequenced. The media delivery system 104 further determines a correlation between the sequencing history and the track features of the same media content items or the media content items having the similar characteristic. Such a correlation can be used to determine or predict how the track features 230 of the media content items to be sequenced should be weighted.

At operation 296, the media delivery system 104 can predict weights on the track features of the media content items to be sequenced, depending on the characteristic of the media content items.

FIG. 8 illustrates an example table 300 for showing the track features 230 and the aggregated track feature 302 for each media content item 116. In this example, the track features 230 and the aggregated track feature 302 are represented with numerical values. In some embodiments, the values of the track features and the aggregated track feature can be normalized.

In some embodiments, the aggregated track feature 302 is obtained as a weighted sum of the track features 230, such as the timbre feature 250, the key and mode feature 242, and the tempo feature 244. In the example table 300, the track features 230 are weighted such that only the tempo feature 244 is considered, without the other track features (i.e., Timbre:Key/Mode:Tempo=0:0:1).

FIG. 9 illustrates an example table 310 in which the media content items 116 are sequenced based on the aggregated track features 302. In this example, the media content items are arranged starting from the seed media content item 310 (in this example, Track ID 1117) and ordered based on the smallest difference between the aggregated track features of adjacent media content items, as used at the operation 216 described with reference to FIG. 3. In other embodiments, other methods can be used to order the given media content items.

FIG. 10 illustrates another example method 330 for automatically sequencing media content items. In this example, the method 330 is described as being performed in the media delivery system 104 including the media content sequence determination engine 112. However, in other embodiments, only some of the processes in the method 330 can be performed by the media delivery system 104. In other embodiments, all or some of the processes in the method 330 are performed by the media playback device 102. In yet other embodiments, all or some of the processes in the method 330 are performed by both of the media delivery system 104 and the media playback device 102 in cooperation.

At least some of the operations in the method 330 are performed similarly to the corresponding operations in the method 200 as described with reference to FIGS. 3-9. Therefore, the description of such operations in the method 200 is incorporated by reference for the method 330.

The operations 332, 334, 336, and 338 are performed similarly to the operations 202, 204, 206, and 208 in the method 200. For brevity purposes, the description of the operations 332, 334, 336, and 338 is omitted.

At operation 340, the media delivery system 104 calculates feature vectors 376 (FIG. 11) for each media content item 116. At operation 342, the media delivery system 104 calculates an aggregated feature vector 378 (FIG. 11) for each media content item 116. An example of such calculations is illustrated and described in more detail with reference to FIG. 11.

At operation 344, the media delivery system 104 operates to compare the aggregated feature vectors 378 of each pair of the media content items 116. At operation 346, the media delivery system 104 then determines similarities between the aggregated feature vectors 378. At operation 348, the media delivery system 104 determines a sequence of the media content items 116 based on the determined similarities. An example of the operations 344, 346, and 348 is described in more detail with reference to FIGS. 14 and 15.

FIG. 11 is a diagram 370 illustrating operations in the method 330. In this example, the media delivery system 104 includes a vector mapping engine 372 and an aggregation engine 374. The diagram 370 is described with reference also to FIG. 12, which illustrates an example mapping of key and mode information to Euclidean space, and FIG. 13, which illustrates an example octave-invariance mapping of tempo to Euclidean space.

In some embodiments, the vector mapping engine 372 and the aggregation engine 374 are included in the media content sequence determination engine 112. In other embodiments, the vector mapping engine 372 and the aggregation engine 374 can be included in any other part of the media delivery system 104. In yet other embodiments, the vector mapping engine 372 and the aggregation engine 374 can be included in the media playback device 102 or any other computing devices.

The vector mapping engine 372 can refer to the track features 230 of each media content item 116 and associate them with corresponding feature vectors 376 in Euclidean spaces. In other embodiments, however, at least one of the feature vectors 376 can be generated from other data which are not directly related to corresponding track features.

Where the acoustic features 240 are concerned, in some embodiments, the vector mapping engine 372 can derive acoustic vectors from a convolutional neural network. A convolutional neural network is a type of feed-forward artificial neural network. One example of the convolutional neural network that can be utilized to obtain the acoustic vectors is described in Aaron Van den Oord, Sander Dieleman, and Benjamin Schrauwen, Deep Content-Based Music Recommendation, in Advances in Neural Information Processing Systems, pages 2643-2651, 2013.

In some examples, one of the acoustic vectors can capture a timbral characteristic (such as the timbre feature 250) of a media content item. Using a convolutional neural network, a low-dimensional embedding (such as in an eight (8) dimensional space (R⁸)) can be trained in a supervised setting to minimize the Euclidean distance between similar media content items based on metadata information. In other examples, other approaches can be used to generate one or more acoustic vectors.

Where the key and mode feature 242 is concerned, in some embodiments, the vector mapping engine 372 can map the key and mode information 242 into points 390 in a three (3) dimensional space (R³) so that adjacent keys in the circle of fifths and relative major/minor keys are equidistant, as illustrated in FIG. 12. The points 390 can be represented as a key and mode feature vector. In other embodiments, a feature vector for the key and mode information can be generated using other methods.
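
One possible realization of such a mapping, shown only as a sketch: major keys are placed around the circle of fifths, and a minor key is placed at the angle of its relative major with an offset in the third dimension equal to one circle-of-fifths step, so that the two relationships named above produce equal distances. The radius and the specific geometry are assumptions, not the mapping prescribed by FIG. 12.

```python
import math

def key_mode_to_point(key, mode, radius=1.0):
    """Map key (pitch class 0-11, C=0) and mode (1=major, 0=minor) to a point in R^3.

    Major keys sit on a circle ordered by fifths; a minor key sits at the angle of
    its relative major, offset along z by the distance between neighboring keys on
    the circle, so circle-of-fifths neighbors and relative major/minor pairs are
    equidistant.
    """
    relative_major = key if mode == 1 else (key + 3) % 12
    position_on_circle = (relative_major * 7) % 12           # steps of a fifth
    angle = 2 * math.pi * position_on_circle / 12
    step = 2 * radius * math.sin(math.pi / 12)               # neighbor distance on the circle
    z = 0.0 if mode == 1 else step
    return (radius * math.cos(angle), radius * math.sin(angle), z)

print(key_mode_to_point(0, 1))  # C major
print(key_mode_to_point(9, 0))  # A minor, adjacent to C major
```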

Where the tempo feature 244 is concerned, in some embodiments, the vector mapping engine 372 can map the tempo feature 244 in a binary logarithmic scale. For example, in certain applications, tempo-octave invariance is preserved, and tempo is represented as a unit vector whose polar angle is mapped into a tempo octave, as illustrated in FIG. 13. In other embodiments, a feature vector for the tempo feature can be generated using other methods.
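
A minimal sketch of one such octave-invariant tempo mapping, assuming the polar angle is taken from the fractional part of the binary logarithm of the tempo; the exact scaling is an assumption for illustration.

```python
import math

def tempo_to_unit_vector(bpm):
    """Map tempo to a unit vector whose polar angle is the fractional part of
    log2(bpm), so tempos one octave apart (e.g., 60 and 120 BPM) map to the
    same point (tempo-octave invariance)."""
    angle = 2 * math.pi * (math.log2(bpm) % 1.0)
    return (math.cos(angle), math.sin(angle))

print(tempo_to_unit_vector(60))
print(tempo_to_unit_vector(120))  # same direction as 60 BPM
```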

In some embodiments, the mapping of the acoustic features 240, the key and mode feature 242, and the tempo feature 244 described above can be used as required for a particular purpose or application. In other embodiments, however, any number of dimensions and/or any type of mapping can be used.

Referring still to FIG. 11, the aggregation engine 374 operates to generate an aggregated feature vector 378 for each media content item based on the feature vectors 376. In some embodiments, the aggregated feature vector 378 can be constructed by concatenating the individual feature vectors 376. In some embodiments, the feature vectors 376 can be scaled based on the weighting data 380. In the illustrated example, the weighting data 380 is used in the aggregation engine 374 to scale the feature vectors 376 to generate the aggregated feature vector 378. Alternatively, the weighting data 380 can be provided to the vector mapping engine 372 so that the track features 230 are scaled based on the weighting data 380 before the feature vectors 376 are constructed.
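
A minimal sketch of this scale-then-concatenate aggregation, assuming per-feature scalar weights and feature vectors keyed by feature name; the names and dimensions are illustrative assumptions.

```python
def aggregate_feature_vector(feature_vectors, weights):
    """Build an aggregated feature vector by scaling each individual feature
    vector by its weight and concatenating the results.

    feature_vectors and weights are dicts keyed by feature name, e.g.
    {"timbre": [...8 values...], "key_mode": [x, y, z], "tempo": [cx, cy]}.
    """
    aggregated = []
    for name, vector in feature_vectors.items():
        aggregated.extend(weights[name] * component for component in vector)
    return aggregated
```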

FIG. 14 illustrates an example method 410 for determining similarities between media content items, which can be used at the operations 344, 346, and 348 in the method 330 as described in FIG. 10.

At operation 412, the media delivery system 104 operates to calculate a distance (such as the Euclidean distance) between the aggregated feature vectors 378 of each pair from the media content items 116. Calculation of the distance between two aggregated feature vectors 378 is repeated for all possible pairs from the media content items 116 in the playlist. In other embodiments, any other distance measurement can be used to calculate a distance between two aggregated feature vectors.
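
By way of illustration, the sketch below computes the Euclidean distance for every possible pair of aggregated feature vectors; it assumes a dictionary mapping track IDs to vectors of equal length.

```python
import itertools
import math

def pairwise_distances(vectors):
    """Euclidean distance between the aggregated feature vectors of every
    possible pair of tracks; vectors maps track ID to a feature vector."""
    distances = {}
    for (id_a, vec_a), (id_b, vec_b) in itertools.combinations(vectors.items(), 2):
        d = math.sqrt(sum((x - y) ** 2 for x, y in zip(vec_a, vec_b)))
        distances[(id_a, id_b)] = distances[(id_b, id_a)] = d
    return distances
```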

At operation 414, the media delivery system 104 determines a seed media content item 310 (such as shown in FIG. 8), which is to be played first among the media content items. As described herein, the seed media content item 310 can be manually selected by the user, or automatically selected by the media delivery system 104 or the media playback device 102.

At operation 416, the media delivery system 104 determines a sequence between the media content items in the playlist based on the distances calculated at the operation 412. The sequence begins from the seed media content item 310. In a simple example, a first media content item is arranged prior to a second media content item and the second media content item is arranged prior to a third media content item when a distance between an aggregated feature vector of the first media content item and an aggregated feature vector of the second media content item is smaller than a distance between the aggregated feature vector of the first media content item and an aggregated feature vector of the third media content item. Other example sequencing methods are further described with reference to FIG. 15.

FIG. 15 illustrates an example method 430 for determining a sequence of media content items, which can be used at the operation 416 in the method 410 as described in FIG. 14. The method 430 can be described with further reference to FIG. 16, which is an example graph 450 for determining the sequence of media content items.

In this example, the sequence of media content items is modeled as a graph traversal problem and determined using a graph which represents the media content items to be sequenced and the similarities between the media content items.

At operation 432, the media delivery system 104 operates to generate a graph 450 (FIG. 16) for representing the track features of the media content items in the playlist. In some embodiments, as illustrated in FIG. 16, the graph 450 (G=(V, E)) is a complete symmetric graph having a plurality of vertices (V) 452 (including 452A-J) and a plurality of edges (E) 454. In the illustrated example, the graph 450 has ten (10) vertices 452A-J connected through the edges 454. In some embodiments, the graph 450 is a directed graph, in which edges have orientations. In other embodiments, the graph 450 is an undirected graph, in which edges have no orientation.
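
For illustration only, the graph 450 can be encoded as an adjacency structure whose edge weights are the pairwise distances between aggregated feature vectors; the sketch below builds a complete, symmetric (undirected) representation. This data layout is an assumption for the sketch, not the representation actually used by the media delivery system 104.

import numpy as np

def build_graph(vectors):
    """Complete symmetric graph as an adjacency dict keyed by item index;
    each edge weight is the Euclidean distance between two items' vectors."""
    n = len(vectors)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            d = float(np.linalg.norm(vectors[i] - vectors[j]))
            graph[i][j] = d     # undirected: the weight is stored both ways
            graph[j][i] = d
    return graph

vectors = np.random.rand(10, 8)   # ten items, mirroring vertices 452A-J
graph = build_graph(vectors)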

In the graph 450, the vertices 452 correspond with the media content items 116 to be sequenced, respectively. Where the graph 450 is symmetrical, the positions of the media content items 116 with respect to the vertices 452 are irrelevant.

Each of the edges 454 connecting the vertices 452 (i.e., the media content items) can have a property representative of the similarity between two media content items connected via that edge. In some embodiments, each of the edges 454 represents a distance (e.g., Euclidean distance) between the aggregated feature vectors 378 of two media content items connected via that edge. In the illustrated example of FIG. 16, the property of each edge 454 can be depicted as the thickness of the edge. In one example, a thicker edge between two vertices can indicate that a distance between the aggregated feature vectors of two media content items corresponding to the vertices is shorter than another distance, and, therefore, that the two media content items are more similar than another pair of media content items. In another example, a thicker edge between two vertices can indicate that a distance between the aggregated feature vectors of two media content items corresponding to the vertices is longer than another distance, and, therefore, that the two media content items are less similar than another pair of media content items. In other embodiments, the property of each edge 454 can be represented as a numerical value annotated with that edge. Other forms for representing the properties of edges are also possible.

At operation 434, the media delivery system 104 identifies a seed vertex 456. The seed vertex 456 is a vertex associated with the seed media content item 310. When determining an optimal path in subsequent operations, the seed vertex 456 is used as a starting point.

At operation 436, the media delivery system 104 determines an optimal path 460 (dotted lines in FIG. 16) that visits all the vertices 452 (such as 452A-J) only once. The optimal path can be a route, consisting of the edges 454, that connects all the vertices 452 exactly once at a lower total cost than other candidate routes. A total cost of a path can be determined based on the property of the edges in the path. As described herein, the property of an edge includes a distance between two vertices connected via that edge, which is indicative of a similarity between two media content items corresponding to the two vertices. Therefore, the total cost of a path can be a sum of distances between adjacent vertices in that path.
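
Expressed in code, and under the assumptions of the adjacency sketch above, the total cost of a candidate path is simply the sum of the edge weights (distances) between consecutive vertices:

def path_cost(path, graph):
    """Sum of edge weights along a path given as a list of vertex indices."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))

# e.g., path_cost([0, 3, 1, 2], graph) with the graph from the earlier sketch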

In some embodiments, the optimal path is found using the shortest Hamiltonian path approach. In other embodiments, the optimal path can be found using the shortest Hamiltonian cycle approach. As the Hamiltonian path problem and the Hamiltonian cycle problem are both NP-complete, approximation approaches are used to find the shortest paths at the operation 436. In one example, a straightforward greedy approximation can be used, which iteratively selects the closest non-visited vertex, starting from the seed vertex. In another example, an improvement to the straightforward greedy approximation can be made by selecting the closest non-visited vertex from either the tail or the head of the partial sequencing.
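
The two approximations can be sketched as follows, reusing the adjacency structure from the earlier sketch. Neither variant is guaranteed to find the true shortest Hamiltonian path, and the two-ended variant may move the seed away from the first position unless extension is restricted to the tail. The function names are illustrative.

def greedy_path(graph, seed):
    """Iteratively append the closest non-visited vertex, starting at the seed."""
    path = [seed]
    remaining = set(graph) - {seed}
    while remaining:
        nxt = min(remaining, key=lambda v: graph[path[-1]][v])
        path.append(nxt)
        remaining.remove(nxt)
    return path

def greedy_path_two_ended(graph, seed):
    """Extend from whichever end of the partial sequence has the closer vertex."""
    path = [seed]
    remaining = set(graph) - {seed}
    while remaining:
        head = min(remaining, key=lambda v: graph[path[0]][v])
        tail = min(remaining, key=lambda v: graph[path[-1]][v])
        if graph[path[0]][head] < graph[path[-1]][tail]:
            path.insert(0, head)
            remaining.remove(head)
        else:
            path.append(tail)
            remaining.remove(tail)
    return path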

As the edges are weighted by the Euclidean distance between the corresponding media content item features (e.g., the aggregated feature vectors) in constructing the graph 450, the total cost of sequencing can be a sum of all the weights of the edges in the path.

At operation 438, the media delivery system 104 determines a sequence of the media content items based on the optimal path 460. When the optimal path 460 is determined at the operation 436, the media content items can be arranged in the same order as the corresponding vertices 452 along the calculated optimal path 460.

As such, the operations 436 and 438 are configured to determine an optimal path that visits each vertex 452 exactly once. Accordingly, such an optimal path among the vertices 452 can give an optimal order of the media content items 116.

Referring now to FIGS. 17 and 18, in certain examples, the system of the present disclosure can be used to play back a plurality of media content items to continuously support a user's repetitive motion activity without disrupting the user's cadence.

Users of media playback devices often consume media content while engaging in various activities, including repetitive motion activities. As noted above, examples of repetitive-motion activities may include swimming, biking, running, rowing, and other activities. Consuming media content may include one or more of listening to audio content, watching video content, or consuming other types of media content. For ease of explanation, the embodiments described in this application are presented using specific examples. For example, audio content (and in particular music) is described as an example of one form of media consumption. As another example, running is described as one example of a repetitive-motion activity. However, it should be understood that the same concepts are equally applicable to other forms of media consumption and to other forms of repetitive-motion activities, and at least some embodiments include other forms of media consumption and/or other forms of repetitive-motion activities.

The users may desire that the media content fits well with the particular repetitive activity. For example, a user who is running may desire to listen to music with a beat that corresponds to the user's cadence. Beneficially, by matching the beat of the music to the cadence, the user's performance or enjoyment of the repetitive-motion activity may be enhanced. This desire cannot be met with traditional media playback devices and media delivery systems.

FIG. 17 illustrates an example system 1000 for managing a sequence between media content items to continuously support a repetitive motion activity. In some embodiments, the system 1000 is configured similarly to the system 100 as described herein. Therefore, the description of all the features and elements in the system 100 is incorporated by reference for the system 1000. Where like or similar features or elements are shown, the same reference numbers will be used where possible. The following description of the system 1000 will be limited primarily to the differences from the system 100.

In the system 1000, the media playback device 102 further includes a cadence-acquiring device 1114, as well as the media content sequencing engine 110. Also shown is a user U who is running. The user U's upcoming steps S are shown as well. A step represents a single strike of the runner's foot upon the ground.

The media playback device 102 can play media content for the user based on the user's cadence. In the example shown, the media output 108 includes music with a tempo that corresponds to the user's cadence. The tempo (or rhythm) of music refers to the frequency of the beat and is typically measured in beats per minute (BPM). The beat is the basic unit of rhythm in a musical composition (as determined by the time signature of the music). Accordingly, in the example shown, the user U's steps occur at the same frequency as the beat of the music.

For example, if the user U is running at a cadence of 180 steps per minute, the media playback device 102 may play a media content item having a tempo equal to or approximately equal to 180 BPM. In other embodiments, the media playback device 102 plays a media content item having a tempo equal to or approximately equal to the result of dividing the cadence by an integer, such as a tempo that is equal to or approximately equal to one-half (e.g., 90 BPM when the user is running at a cadence of 180 steps per minute), one-fourth, or one-eighth of the cadence. Alternatively, the media playback device 102 plays a media content item having a tempo that is equal to or approximately equal to an integer multiple (e.g., 2×, 4×, etc.) of the cadence. Further, in some embodiments, the media playback device 102 operates to play multiple media content items, including one or more media content items having a tempo equal to or approximately equal to the cadence and one or more media content items having a tempo equal to or approximately equal to the result of multiplying or dividing the cadence by an integer. Various other combinations are possible as well.
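
A hedged sketch of this matching rule: candidate target tempos are formed by multiplying or dividing the cadence by small integers, and the track whose tempo is closest to any candidate is chosen. The factor set, helper names, and example tempos are assumptions made only for illustration.

def candidate_tempos(cadence_spm, factors=(1, 2, 4, 8)):
    """Target tempos: the cadence multiplied or divided by each factor."""
    targets = set()
    for f in factors:
        targets.add(cadence_spm / f)    # e.g., 90 BPM for a 180 spm cadence (f=2)
        targets.add(cadence_spm * f)    # e.g., 360 BPM for a 180 spm cadence (f=2)
    return sorted(targets)

def best_track(tracks_bpm, cadence_spm):
    """Pick the track tempo closest to any candidate target tempo."""
    targets = candidate_tempos(cadence_spm)
    return min(tracks_bpm, key=lambda bpm: min(abs(bpm - t) for t in targets))

print(best_track([120, 150, 176], 180))   # 176, the closest to the 180 spm target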

In some embodiments, the media playback device 102 operates to play music having a tempo that is within a predetermined range of a target tempo. In at least some embodiments, the predetermined range is plus or minus 2.5 BPM. For example, if the user U is running at a cadence of 180 steps per minute, the media playback device 102 operates to play music having a tempo of 177.5-182.5 BPM. Alternatively, in other embodiments, the predetermined range is itself in a range from 1 BPM to 10 BPM. Other ranges of a target tempo are also possible.

Further, in some embodiments, the media content items that are played back on the media playback device 102 have a tempo equal to or approximately equal to a user U's cadence after it is rounded. For example, the cadence may be rounded to the nearest multiple of 2.5, 5, or 10, and then the media playback device 102 plays music having a tempo equal to or approximately equal to the rounded cadence. In yet other embodiments, the media playback device 102 uses the cadence to select a predetermined tempo range of music for playback. For example, if the user U's cadence is 181 steps per minute, the media playback device 102 may operate to play music from a predetermined tempo range of 180-184.9 BPM; while if the user U's cadence is 178 steps per minute, the media playback device 102 may operate to play music from a predetermined tempo range of 175-179.9 BPM.
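
The rounding and range-selection rules above can be sketched as follows, using the plus or minus 2.5 BPM tolerance and the 5-BPM buckets from the examples in this description; other rounding steps and ranges are equally possible.

def tempo_window(cadence_spm, rounding=2.5, tolerance=2.5):
    """Round the cadence to the nearest multiple of `rounding`, then apply the tolerance."""
    target = round(cadence_spm / rounding) * rounding
    return target - tolerance, target + tolerance

def tempo_bucket(cadence_spm, bucket_size=5.0):
    """Half-open predetermined tempo range containing the cadence
    (e.g., 180 up to but not including 185, i.e., the 180-184.9 BPM range)."""
    low = (cadence_spm // bucket_size) * bucket_size
    return low, low + bucket_size          # upper bound exclusive

print(tempo_window(180))   # (177.5, 182.5)
print(tempo_bucket(181))   # (180.0, 185.0): the 180-184.9 BPM range
print(tempo_bucket(178))   # (175.0, 180.0): the 175-179.9 BPM range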

Referring still to FIG. 17, the cadence-acquiring device 1114 operates to acquire a cadence associated with the user U. In at least some embodiments, the cadence-acquiring device 1114 operates to determine cadence directly and includes one or more accelerometers or other motion-detecting technologies. Alternatively, the cadence-acquiring device 1114 operates to receive data representing a cadence associated with the user U. For example, in some embodiments, the cadence-acquiring device 1114 operates to receive data from a watch, bracelet, foot pod, chest strap, shoe insert, anklet, smart sock, bicycle computer, exercise equipment (e.g., treadmill, rowing machine, stationary cycle), or other device for determining or measuring cadence. Further, in some embodiments, the cadence-acquiring device 1114 operates to receive a cadence value input by the user U or another person.

FIG. 18 illustrates an example of the media delivery system 104 of FIG. 17 for managing a sequence between media content items to continuously support a repetitive motion activity. In the system 1000, the media delivery system 104 further includes a media server 1200 and a repetitive-motion activity server 1202. The media server 1200 includes the media server application 150, the processing device 152, the memory device 154, and the network access device 156, as described herein.

In at least some embodiments, the media server 1200 and the repetitive-motion activity server 1202 are provided by separate computing devices. In other embodiments, the media server 1200 and the repetitive-motion activity server 1202 are provided by the same computing devices. Further, in some embodiments, one or both of the media server 1200 and the repetitive-motion activity server 1202 are provided by multiple computing devices. For example, the media server 1200 and the repetitive-motion activity server 1202 may be provided by multiple redundant servers located in multiple geographic locations.

The repetitive-motion activity server 1202 operates to provide repetitive-motion activity-specific information about media content items to media playback devices. In some embodiments, the repetitive-motion activity server 1202 includes a repetitive-motion activity server application 1220, a processing device 1222, a memory device 1224, and a network access device 1226. The processing device 1222, memory device 1224, and network access device 1226 may be similar to the processing device 152, memory device 154, and network access device 156, respectively, which have each been previously described.

In some embodiments, the repetitive-motion activity server application 1220 operates to transmit information about the suitability of one or more media content items for playback during a particular repetitive-motion activity. The repetitive-motion activity server application 1220 includes a repetitive-motion activity interface 1228 and a repetitive-motion activity media metadata store 1230.

In some embodiments, the repetitive-motion activity server application 1220 may provide a list of media content items at a particular tempo to a media playback device in response to a request that includes a particular cadence value. Further, in some embodiments, the media content items included in the returned list will be particularly relevant for the repetitive motion activity in which the user is engaged (for example, if the user is running, the returned list of media content items may include only media content items that have been identified as being highly runnable).
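
Purely as an illustration of such a request, the sketch below filters a hypothetical metadata store by the tempo bucket derived from the cadence and by a minimum runnability score. The field names ("bpm", "runnability") and the threshold are assumptions for the sketch, not the actual schema used by the repetitive-motion activity server 1202.

def tracks_for_cadence(metadata, cadence_spm, min_runnability=0.7, bucket_size=5.0):
    """Return item ids whose tempo falls in the cadence's bucket and that score well."""
    low = (cadence_spm // bucket_size) * bucket_size
    high = low + bucket_size                      # upper bound exclusive
    return [item_id for item_id, m in metadata.items()
            if low <= m["bpm"] < high and m["runnability"] >= min_runnability]

metadata = {
    "track-1": {"bpm": 181.0, "runnability": 0.90},
    "track-2": {"bpm": 182.5, "runnability": 0.40},
    "track-3": {"bpm": 150.0, "runnability": 0.95},
}
print(tracks_for_cadence(metadata, 181))          # ['track-1']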

The repetitive-motion activity interface 1228 operates to receive requests or other communication from media playback devices or other systems to retrieve information about media content items from the repetitive-motion activity server 1202. For example, in FIG. 2, the repetitive-motion activity interface 1228 receives communication 184 from the media playback engine 146.

In some embodiments, the repetitive-motion activity media metadata store 1230 stores repetitive-motion activity media metadata 1232. The repetitive-motion activity media metadata store 1230 may comprise one or more databases and file systems. Other embodiments are possible as well.

The repetitive-motion activity media metadata 1232 operates to provide various information associated with media content items, such as the media content items 170. In some embodiments, the repetitive-motion activity media metadata 1232 provides information that may be useful for selecting media content items for playback during a repetitive-motion activity. For example, in some embodiments, the repetitive-motion activity media metadata 1232 stores runnability scores for media content items that correspond to the suitability of particular media content items for playback during running. As another example, in some embodiments, the repetitive-motion activity media metadata 1232 stores timestamps (e.g., start and end points) that identify portions of a media content item that are particularly well-suited for playback during running (or another repetitive-motion activity).
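
One possible, illustrative shape for such metadata is shown below; the field names are assumptions for the sketch, not the actual layout of the repetitive-motion activity media metadata 1232.

repetitive_motion_metadata = {
    "track-1": {
        "runnability": 0.92,                                        # suitability for running
        "well_suited_segment": {"start_s": 31.5, "end_s": 214.0},   # portion flagged for running
    },
    "track-2": {
        "runnability": 0.38,
        "well_suited_segment": None,                                # no portion flagged
    },
}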

Each of the media playback device 102 and the media delivery system 104 can include additional physical computer or hardware resources. In at least some embodiments, the media playback device 102 communicates with the media delivery system 104 via the network 106.

In at least some embodiments, the media delivery system 104 can be used to stream, progressively download, or otherwise communicate music, other audio, video, or other forms of media content items to the media playback device 102 based on a cadence acquired by the cadence-acquiring device 1114 of the media playback device 102. In accordance with an embodiment, a user U can direct input to the user interface 144 to issue requests, for example, to play back media content corresponding to the cadence of a repetitive motion activity on the media playback device 102.

The media mix data generation engine 1240 operates to generate media mix data to be used for sequencing and/or crossfading cadence-based media content items. As described herein, such media mix data can be incorporated in the repetitive-motion activity media metadata 1232.

In this example, the media content sequencing engine 110 operates to arrange selected media content items (such as ones in a playlist) in such an order that the media content items are played on the media playback device 102 to continuously support a user's repetitive motion activity without interruption or jarring effect.

In this document, for the purpose of determining track features or feature vectors, calculating an aggregated track feature or aggregated feature vector, or determining similarity between two media content items or tracks, a media content item or a track may indicate the entire media content item or the entire track, a portion of the media content item or a portion of the track, or a collection of media content items or a collection of tracks, such as an album or a playlist.

The various examples and teachings described above are provided by way of illustration only and should not be construed to limit the scope of the present disclosure. Those skilled in the art will readily recognize various modifications and changes that may be made without following the examples and applications illustrated and described herein, and without departing from the true spirit and scope of the present disclosure.

What is claimed is:
1. A method for playing media content items, the method comprising: determining a plurality of track features of each of the media content items; obtaining weighting data for the plurality of track features; generating a plurality of weighted track features for each of the media content items by applying the weighting data to the plurality of track features of each of the media content items; calculating aggregated track features for the media content items, respectively, based on the plurality of weighted track features; comparing the aggregated track features to determine similarities between the aggregated track features; and determining a sequence of the media content items based on the similarities.
2. The method of claim 1, further comprising: receiving a selection of the plurality of media content items.
3. The method of claim 1, further comprising: obtaining a playlist identifying the plurality of media content items.
4. The method of claim 1, wherein obtaining weighting data comprises: receiving a user input of weights on the plurality of track features.
5. The method of claim 1, wherein obtaining weighting data comprises: obtaining sequencing history data; determining a sequencing history of the media content items based on the sequencing history data; and predicting weights on the plurality of track features of the media content items.
6. The method of claim 1, wherein the plurality of track features includes acoustic features, key and mode information, and tempo.
7. The method of claim 1, wherein the aggregated track features are represented by numerical values.
8. The method of claim 1, wherein determining a sequence of the media content items comprises: arranging a first media content item prior to a second media content item, the first media content item and the second media content item being selected from the media content items, and the first media content item being played before the second media content item; and arranging the second media content item prior to a third media content item, the third media content item being selected from the media content items and played after the second media content item, a difference between a numerical value of an aggregated track feature of the first media content item and a numerical value of an aggregated track feature of the second media content item being smaller than a difference between the numerical value of the aggregated track feature of the first media content item and a numerical value of an aggregated track feature of the third media content item.
9. The method of claim 1, further comprising: identifying a seed media content item selected from the media content items, the seed media content item sequenced to be played first among the media content items.
10. The method of claim 8, wherein the first media content item is identified as a seed media content item, the seed media content item sequenced to be played first among the media content items.
11. A method for sequencing media content items, the method comprising: determining a plurality of track features of each of the media content items; weighting the plurality of track features; mapping the plurality of weighted track features of each of the media content items to an aggregated feature vector; determining similarities among the aggregated feature vectors; and determining a sequence of the media content items based on the similarities.
12. The method of claim 11, wherein determining similarities comprises: calculating distances between the aggregated feature vectors.
13. The method of claim 12, wherein determining a sequence of the media content items comprises: arranging a first media content item prior to a second media content item, the first media content item and the second media content item being selected from the media content items, and the first media content item being played before the second media content item; and arranging the second media content item prior to a third media content item, the third media content item being selected from the media content items and played after the second media content item, a distance between a feature vector of the first media content item and a feature vector of the second media content item being smaller than a distance between the feature vector of the first media content item and a feature vector of the third media content item.
14. The method of claim 11, wherein determining similarities comprises: generating a complete symmetric graph with vertices and edges, the vertices associated with the media content items, respectively, and connected via the edges, the edges having values representative of distances between the aggregated feature vectors of the media content items; and determining an optimal path crossing all of the vertices, the optimal path used to determine the sequence of the media content items.
15. The method of claim 14, further comprising: identifying a seed vertex from the vertices, the seed vertex associated with one of the media content items to be played first among the media content items.
16. The method of claim 14, wherein the optimal path includes a route defined by at least some of the edges and visiting all the vertices only once.
17. The method of claim 14, wherein the optimal path is calculated using the shortest Hamiltonian path.
18. The method of claim 11, further comprising: obtaining a playlist identifying the media content items.
19. The method of claim 11, further comprising: receiving a user input of weights on the plurality of track features.
20. A computer readable storage device storing data instructions that, when executed by a processing device, cause the processing device to: determine a plurality of track features of each of the media content items; weight the plurality of track features; map the plurality of weighted track features of each of the media content items to an aggregated feature vector; determine similarities among the aggregated feature vectors; and determine a sequence of the media content items based on the similarities.
21. A system comprising: at least one processing device; and at least one computer readable storage device storing data instructions, which when executed by the at least one processing device, cause the at least one processing device to: determine a plurality of track features of each of the media content items; weight the plurality of track features; map the plurality of weighted track features of each of the media content items to an aggregated feature vector; determine similarities among the aggregated feature vectors; and determine a sequence of the media content items based on the similarities.